This is the third notebook that we are presenting. In it we address the suggestions we received during the presentation, along with some of the next steps outlined there.

There are three big things we want to do in this notebook. First, we adjust revenue and budget for the time value of money. Then we perform feature selection using two different methods: recursive feature selection (a brute-force method using a library called Boruta) and feature selection using ANOVA. We run both on two data sets: the NA data set, where we drop all NAs, and the imputed data set. We use these two data sets specifically because they are the ones we analyzed in the second notebook, where they produced interesting results. Everything said about ANOVA in the second notebook applies here as well. The reason we run two feature selection methods is to compare them: we want to see which features they recommend in common and where they differ, and then determine which method performs better according to error metrics. We would also ideally run multiple train/test splits, since we know the error rate depends heavily on the split and can range by as much as 100 percentage points of MAPE in some cases. We recognize that this may not be computationally feasible, as it already takes a significant amount of time to run the model once, let alone multiple times.

We already did an ANOVA feature selection in notebook two; we repeat it here to compare feature selection methods, since Joe was interested in which one would perform better. Additionally, the ANOVA feature selection in notebook two was done on a data set that was not adjusted for the time value of money, so we can also see the impact that this adjustment alone has on the results.

The first thing we must do, as always, is load all of the libraries that we will be using.

library(readr)
library(stringr)
library(tidyverse)
library(dplyr)
library(mice)
library(VIM)
library(plyr) # note: loading plyr after dplyr masks several dplyr verbs (arrange, mutate, summarise, ...)
library(tidyr)
library(ggplot2)
library(sf)
library(sjmisc)
library(highcharter)
library(openair)
library(zoo)
library(countrycode)
library(ggmap)
library(blscrapeR) ##needed to get index for adjusting for inflation
library(Boruta)
library(randomForest)
library(mlbench)
library(Metrics)

Work with the NAs CSV

This is the CSV where all the NAs are included; here we will simply drop all NA values

Read in the data that we want to work with

data_na = read.csv("allMerge_clean_withNA.csv")
head(data_na)

Do some basic class conversions

#converting classes (pass as.character itself, not as.character(), to mutate_if)
data_na <- mutate_if(data_na, is.factor, as.character)
data_na$budget <- as.numeric(data_na$budget)
data_na$Total.Revenue = as.numeric(data_na$Total.Revenue)
head(data_na)

Adjust Revenue and Budget for Time Value of Money

Extract the Year from the release_date column and store it in a variable called year

data_na$Year = str_extract(data_na$release_date, "\\d{4}")
head(data_na)

Create a table that will give us the adjustment amount based on a base year of 2020

table = inflation_adjust(2020)
trying URL 'https://download.bls.gov/pub/time.series/cu/cu.data.1.AllItems'
Content type 'text/plain' length unknown
downloaded 2.3 MB
table

Create a data frame with the values we need from the table

table <- as.data.frame(table)
table$adj_value2 <- ((100 + table$pct_increase)/100)
df <- table[,c("year","adj_value2")]
colnames(df) = c("Year", "adj_value") #changing name for left_join
df
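
To make the adjustment concrete, here is a toy example of the formula above with a made-up pct_increase (purely illustrative, not a real BLS figure):

```r
# Toy illustration of the adjustment arithmetic used below.
pct_increase <- 100                      # hypothetical value, not from BLS
adj_value <- (100 + pct_increase) / 100  # = 2, per the formula above
nominal_revenue <- 50e6
nominal_revenue / adj_value              # 2.5e7: the adjusted figure used downstream
```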

Merge the data frame from above with this data frame based on the year

data_na = left_join(df, data_na, by = 'Year') 
data_na

Convert adjusted_revenue and adjusted_budget to integers. We do this because the adjustment factors are very precise, which leaves a lot of decimal places in some cases

data_na$adjusted_revenue = as.integer(data_na$Total.Revenue/data_na$adj_value)
NAs introduced by coercion to integer range
data_na$adjusted_budget = as.integer(data_na$budget/data_na$adj_value)
data_na

Drop all the columns that we will not be using

data_na = subset(data_na, select = -c(homepage, id, imdb_id, overview, poster_path, revenue, status, video, original_title, orginal_title_2, year_2, Year, adj_value, budget, Total.Revenue, title, X,production_countries, production_companies, tagline, spoken_languages, genres, cast, crew, belongs_to_collection, prod_comp_name, adult, original_language))

Also drop all the NA values from the dataset, leaving us with ~5000 data points

data_na <- drop_na(data_na)

Convert the variable types to factors (and release_date to a Date)

data_na$release_date <- as.Date(data_na$release_date)
data_na <- data_na %>% mutate_if(is.logical,as.factor)
data_na <- data_na %>% mutate_if(is.character,as.factor)
head(data_na) 

Recursive Feature Selection with Boruta on NA Dataset

Running the Boruta model is very easy, but it can be time-consuming, so we cap the number of runs at 100; by then almost all of the variables will have been classified as important or unimportant.

featureSelection_na <- Boruta(adjusted_revenue ~ ., data = data_na, doTrace = 2, maxRuns = 100)
(per-iteration trace lines omitted; Boruta's interim summaries are kept below)
After 13 iterations, +3 mins: 
 confirmed 30 attributes: action, adjusted_budget, adventure, animation, comedy and 25 more;
 rejected 8 attributes: documentary, foreign, mgm, num_spoken_languages, tv_movie and 3 more;
 still have 14 attributes left.
After 17 iterations, +3.8 mins: 
 confirmed 2 attributes: fantasy, romance;
 rejected 3 attributes: mystery, new_line_cinema, rko_radio;
 still have 9 attributes left.
After 21 iterations, +4.6 mins: 
 confirmed 1 attribute: warner_bros;
 still have 8 attributes left.
After 24 iterations, +5.1 mins: 
 confirmed 1 attribute: production_country;
 rejected 1 attribute: music;
 still have 6 attributes left.
After 27 iterations, +5.6 mins: 
 rejected 1 attribute: history;
 still have 5 attributes left.
After 42 iterations, +8.7 mins: 
 confirmed 1 attribute: universal;
 still have 4 attributes left.
After 68 iterations, +14 mins: 
 rejected 1 attribute: columbia;
 still have 3 attributes left.

We can plot the feature selection that boruta returns to get more insight about the relevance of certain variables

plot(featureSelection_na, las = 2, cex.axis = 0.5)

We get interesting initial results: adjusted_budget is by far the most relevant variable for predicting adjusted_revenue compared to almost all other variables. is_in_collection and vote_count are also very important predictors of revenue, not far behind adjusted_budget. Intuitively this makes a lot of sense: you would definitely expect budget and revenue to be positively correlated, you would expect vote count to go up for "good" movies, and a sequel is generally only made when the first movie does really well at the box office.

Various variables (mainly the dummy variables created for production company and genre) are deemed unimportant, such as mgm, music, and rko_radio. Of all the unimportant variables, only one is not a dummy variable: num_spoken_languages.

There are also 3 variables that are tentative, but we can see where they fall on the graph above, so we can choose what to do with them.

Interestingly, the ranking of the variables is not the same when you run it multiple times; it changes slightly with the seed. adjusted_budget, is_in_collection, and release_date were always in the top 3, but the ordering among the genres sometimes differed, so we cannot definitively say that a movie's genre being adventure or family has a bigger impact on revenue.
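
If a reproducible ranking is wanted, the RNG seed can be fixed before the Boruta call; a quick sketch (the seed value here is arbitrary):

```r
# Fixing the seed makes Boruta's importance ranking repeatable across runs.
set.seed(42)  # arbitrary seed, chosen only for reproducibility
featureSelection_seeded <- Boruta(adjusted_revenue ~ ., data = data_na,
                                  doTrace = 0, maxRuns = 100)
attStats(featureSelection_seeded)  # per-attribute importance statistics
```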

So we will fix the tentative variables (Boruta will assign each tentative variable as important or unimportant using the information it already has), and then get the formula that we will plug into the random forest model

featureSelectionFinal_na <- TentativeRoughFix(featureSelection_na)
getNonRejectedFormula(featureSelectionFinal_na)
adjusted_revenue ~ popularity + release_date + runtime + vote_average + 
    vote_count + meterScore + meterClass + is_in_collection + 
    has_tagline + number_of_cast + female_cast + male_cast + 
    unspecified_cast + number_of_crew + female_crew + male_crew + 
    unspecified_crew + comedy + horror + action + drama + fantasy + 
    thriller + animation + adventure + romance + family + twentieth_century + 
    warner_bros + universal + walt_disney + prod_size + num_production_companies + 
    production_country + adjusted_budget
featureSelection_na
Boruta performed 99 iterations in 19.76438 mins.
 35 attributes confirmed important: action, adjusted_budget, adventure, animation, comedy
and 30 more;
 14 attributes confirmed unimportant: columbia, documentary, foreign, history, mgm and 9
more;
 3 tentative attributes left: crime, paramount, science_fiction;

Feature Selection using ANOVA with the NA data set

Do a one way ANOVA of all the variables against adjusted_revenue

one.way_na <- aov(adjusted_revenue ~ ., data = data_na)
summary(one.way_na)
                           Df    Sum Sq   Mean Sq  F value   Pr(>F)    
popularity                  1 2.967e+19 2.967e+19 1760.028  < 2e-16 ***
release_date                1 4.130e+17 4.130e+17   24.498 7.71e-07 ***
runtime                     1 1.061e+19 1.061e+19  629.217  < 2e-16 ***
vote_average                1 2.752e+18 2.752e+18  163.241  < 2e-16 ***
vote_count                  1 8.270e+19 8.270e+19 4904.880  < 2e-16 ***
meterScore                  1 6.038e+16 6.038e+16    3.582 0.058488 .  
meterClass                  2 2.772e+17 1.386e+17    8.220 0.000273 ***
is_in_collection            1 7.698e+18 7.698e+18  456.566  < 2e-16 ***
num_spoken_languages        1 7.541e+15 7.541e+15    0.447 0.503653    
has_tagline                 1 1.109e+17 1.109e+17    6.577 0.010361 *  
number_of_cast              1 2.252e+16 2.252e+16    1.336 0.247882    
female_cast                 1 1.022e+17 1.022e+17    6.061 0.013854 *  
male_cast                   1 4.421e+17 4.421e+17   26.220 3.17e-07 ***
unspecified_cast            1 6.132e+15 6.132e+15    0.364 0.546494    
number_of_crew              1 1.847e+17 1.847e+17   10.955 0.000941 ***
female_crew                 1 7.784e+16 7.784e+16    4.617 0.031709 *  
male_crew                   1 2.262e+17 2.262e+17   13.417 0.000252 ***
comedy                      1 7.453e+17 7.453e+17   44.203 3.31e-11 ***
horror                      1 1.353e+18 1.353e+18   80.230  < 2e-16 ***
action                      1 1.497e+16 1.497e+16    0.888 0.346096    
drama                       1 7.573e+17 7.573e+17   44.917 2.31e-11 ***
documentary                 1 4.162e+15 4.162e+15    0.247 0.619336    
science_fiction             1 3.033e+17 3.033e+17   17.990 2.26e-05 ***
crime                       1 9.873e+17 9.873e+17   58.557 2.40e-14 ***
fantasy                     1 4.826e+17 4.826e+17   28.626 9.21e-08 ***
thriller                    1 1.904e+17 1.904e+17   11.294 0.000784 ***
animation                   1 3.384e+18 3.384e+18  200.703  < 2e-16 ***
adventure                   1 8.017e+17 8.017e+17   47.551 6.09e-12 ***
mystery                     1 7.549e+13 7.549e+13    0.004 0.946652    
war                         1 3.601e+16 3.601e+16    2.136 0.143959    
romance                     1 1.593e+17 1.593e+17    9.448 0.002126 ** 
music                       1 5.599e+16 5.599e+16    3.321 0.068468 .  
family                      1 7.977e+17 7.977e+17   47.314 6.87e-12 ***
western                     1 1.253e+17 1.253e+17    7.434 0.006423 ** 
history                     1 7.467e+16 7.467e+16    4.429 0.035387 *  
tv_movie                    1 5.466e+15 5.466e+15    0.324 0.569123    
foreign                     1 1.126e+16 1.126e+16    0.668 0.413839    
paramount                   1 1.942e+17 1.942e+17   11.516 0.000696 ***
mgm                         1 3.142e+16 3.142e+16    1.863 0.172293    
twentieth_century           1 1.572e+17 1.572e+17    9.327 0.002271 ** 
warner_bros                 1 2.689e+15 2.689e+15    0.159 0.689659    
universal                   1 4.717e+17 4.717e+17   27.980 1.28e-07 ***
columbia                    1 8.226e+16 8.226e+16    4.879 0.027230 *  
rko_radio                   1 4.835e+16 4.835e+16    2.868 0.090443 .  
united_artists              1 1.069e+17 1.069e+17    6.342 0.011824 *  
walt_disney                 1 8.000e+17 8.000e+17   47.453 6.40e-12 ***
new_line_cinema             1 3.783e+16 3.783e+16    2.244 0.134213    
prod_size                   1 7.599e+16 7.599e+16    4.507 0.033810 *  
num_production_companies    1 2.429e+17 2.429e+17   14.406 0.000149 ***
production_country         31 6.640e+17 2.142e+16    1.270 0.144673    
adjusted_budget             1 9.795e+18 9.795e+18  580.978  < 2e-16 ***
Residuals                4558 7.685e+19 1.686e+16                      
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

As we know from DMA, the cutoff p-value is 0.05: anything above that is not important, and anything below it is. The stars beside each variable also indicate the level of importance, but they do not give a clear ranking of which variables matter most; several just show < 2e-16, telling us they are very important without ordering them.
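
Since the printed summary does not rank the variables, the p-values can be pulled out of the ANOVA table and sorted explicitly; a sketch assuming the one.way_na fit from above:

```r
# Extract and rank p-values from the fitted aov object.
anova_tab <- summary(one.way_na)[[1]]
pvals <- anova_tab[["Pr(>F)"]]
names(pvals) <- trimws(rownames(anova_tab))  # row names carry trailing spaces
pvals <- pvals[!is.na(pvals)]                # drop the Residuals row
sort(pvals)[1:10]                            # the ten smallest p-values
names(pvals)[pvals < 0.05]                   # variables passing the 0.05 cutoff
```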

There are 6 variables with a p-value of < 2e-16, and three of them are vote_count, adjusted_budget, and is_in_collection, so the top 3 are the same for both feature selection methods.

The number of features selected is roughly the same, but there are differences in which features are selected. For example, unspecified_cast is deemed unimportant by the ANOVA model but important according to the Boruta model.
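
The overlap and the differences can be listed programmatically; a sketch assuming the featureSelectionFinal_na and one.way_na objects fitted above (getSelectedAttributes() is Boruta's accessor for the confirmed features):

```r
# Compare the two selected feature sets.
boruta_vars <- getSelectedAttributes(featureSelectionFinal_na)

anova_tab  <- summary(one.way_na)[[1]]
anova_vars <- trimws(rownames(anova_tab))[which(anova_tab[["Pr(>F)"]] < 0.05)]

intersect(boruta_vars, anova_vars)  # features both methods agree on
setdiff(boruta_vars, anova_vars)    # Boruta-only (e.g. unspecified_cast)
setdiff(anova_vars, boruta_vars)    # ANOVA-only
```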

After running the models we can see which one generally performed better

Random Forest Model using Boruta Features

As mentioned at the beginning of the notebook, we want to run the model multiple times because of the range in error across train/test splits. For the NA model we run it 30 times, as it did not take too long. The reason we use 200 trees was explained in the second notebook, where we plotted the random forest error to find the ideal number of trees.

#We want to store the error metrics to analyze later
rmse_na_boruta <- c()
mape_na_boruta <- c()

for(i in 1:30){
  
  # Train test split
  num_samples = dim(data_na)[1]
  sampling.rate = 0.8
  training <- sample(1:num_samples, sampling.rate * num_samples, replace=FALSE)
  trainingSet <- subset(data_na[training, ])
  testing <- setdiff(1:num_samples,training)
  testingSet <- subset(data_na[testing, ])
  
  #Train the model
  randomForestModel <- randomForest(adjusted_revenue ~ popularity + release_date + runtime + vote_average + 
    vote_count + meterScore + meterClass + is_in_collection + 
    has_tagline + number_of_cast + female_cast + male_cast + 
    unspecified_cast + number_of_crew + female_crew + male_crew + 
    unspecified_crew + comedy + horror + action + drama + fantasy + 
    thriller + animation + adventure + romance + family + twentieth_century + 
    warner_bros + universal + walt_disney + prod_size + num_production_companies + 
    production_country + adjusted_budget, data=trainingSet, ntree=200)
  
  #Calculate the error
  predictions <- predict(randomForestModel, testingSet)
  error = predictions - testingSet$adjusted_revenue
  mse = mean(error^2)
  rmse_na_boruta[i] <- sqrt(mse)
  errorpct <- ((abs(testingSet$adjusted_revenue - predictions))/testingSet$adjusted_revenue)
  mape_na_boruta[i] <- mean(errorpct)
  
}
rmse_na_boruta
 [1] 124737050 102558693 122492259 116916375 127019762 107007146 123109015 122740541 117728775 119541009
[11] 118853588 120273504 117118753 119396163 123954153 117978901 113296198 121125617 111351457 122997194
[21] 112044798 123644663 136100863 123790583 115179253 132892067 119202705 108628987 130820551 122639466
mape_na_boruta
 [1]  34.20468  30.66354 152.84844  22.76226  96.28503  31.41667 105.11008 121.31701  88.21617  33.38971
[11]  27.86385  22.46998  29.27227  32.18365 107.26778  99.29217 100.83019  89.48064 143.62672  16.43567
[21]  29.07167  98.65965 104.70093  50.53272  14.62367  45.66011  49.17229 110.73941  38.78067 132.50135

We initially see that the RMSE is really large, in the $100M range, but we know this significantly exaggerates the error: the individual errors are already big, and there are ~5K data points, so it is not a good measure on its own. For this model we were able to run 30 train/test splits, and the range in MAPE is really big, from 14% to 152%, so we compute the average to summarize it
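
Beyond the mean, the spread across the 30 splits can be summarized in one call; a quick sketch assuming the mape_na_boruta vector computed above:

```r
# Summarize how much MAPE swings across the 30 train/test splits.
summary(mape_na_boruta)  # min / quartiles / mean / max in one call
sd(mape_na_boruta)       # dispersion across splits
range(mape_na_boruta)    # the 14%-152% spread quoted above
```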

mean(rmse_na_boruta)
[1] 119838003
mean(mape_na_boruta)
[1] 68.64596

The average MAPE is 68%, which is not great, but we already know that our variables cannot predict the box office very accurately; we can still compare feature selection methods

To compare this model against the models done in the second R notebook, we will take RMSE / average adjusted_revenue

mean(rmse_na_boruta)/mean(data_na$adjusted_revenue)
[1] 0.8486409

This number is significantly lower than the numbers we got in the second notebook, which shows that feature selection definitely improves accuracy. There is also the added benefit of requiring less computation, since we are analyzing fewer features. We can also see how much RMSE exaggerates the error, by ~15% in this case.

Random Forest Model using ANOVA Features

Similar to the model above we will be running it 30 times with 200 trees.

rmse_na_anova <- c()
mape_na_anova <- c()

for(i in 1:30){
  num_samples = dim(data_na)[1]
  sampling.rate = 0.8
  training <- sample(1:num_samples, sampling.rate * num_samples, replace=FALSE)
  trainingSet <- subset(data_na[training, ])
  testing <- setdiff(1:num_samples,training)
  testingSet <- subset(data_na[testing, ])
  randomForestModel <- randomForest(adjusted_revenue ~ .- meterScore - num_spoken_languages - number_of_cast - unspecified_cast  - action - documentary - mystery - war - music - tv_movie - foreign - mgm - warner_bros - rko_radio - new_line_cinema - production_country, data=trainingSet, ntree=200)
  
  predictions <- predict(randomForestModel, testingSet)
  error = predictions - testingSet$adjusted_revenue
  mse = mean(error^2)
  rmse_na_anova[i] <- sqrt(mse)
  errorpct <- ((abs(testingSet$adjusted_revenue - predictions))/testingSet$adjusted_revenue)
  mape_na_anova[i] <- mean(errorpct)
  
}
rmse_na_anova
 [1] 140488870 102845115 117452615 101737333 114887395 125011860 137748987 127557539 115593173 123881524
[11] 117964319 108750962 102722889 124210814 101715401 111854977 124251609 119389020 122300600 101127190
[21] 127151025 118376500 134055430 136803209 127967431 133128920 108916743 120660171 113641974 115480796
mape_na_anova
 [1]  98.42916  53.70034  51.86697  28.61978  44.88179  20.40744  42.70237  20.94986  42.46423  40.16076
[11]  18.14329  84.97976 111.69401  25.81500  32.95896 100.36770  23.03898  16.53286  69.27055  44.28644
[21]  76.80656  19.69833  52.00450  60.92622  23.09176  38.14424 134.95396  87.59915  37.24281  58.85535

We again see that the RMSE is really large, in the $100M range, with the same caveat that it exaggerates the error given ~5K data points. Looking at the MAPE, its range is smaller than for the Boruta model, running from 16% to 111%

mean(rmse_na_anova)
[1] 119255813
mean(mape_na_anova)
[1] 52.01977

As we could already see, the features from ANOVA performed better. However, something we realized after running the model is that we cannot actually compare the two yet: the train/test splits are different, so this result could be due to the splits rather than the feature sets. This is something we wanted to fix, but this model needs to run overnight and we no longer have time before submission
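
A sketch of the fix described above (not run in this notebook): generate the 30 splits once with a fixed, arbitrary seed and reuse split i for both the Boruta-feature and the ANOVA-feature model, so any MAPE difference reflects the feature set rather than the split.

```r
# Pre-generate identical train/test splits for both models.
set.seed(2020)  # arbitrary seed, for reproducibility only
num_samples <- nrow(data_na)
splits <- lapply(1:30, function(i) {
  sample(1:num_samples, floor(0.8 * num_samples), replace = FALSE)
})
# Iteration i for either model would then use:
# trainingSet <- data_na[splits[[i]], ]; testingSet <- data_na[-splits[[i]], ]
```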

To compare this model against the models done in the second R notebook, we will take RMSE / average adjusted_revenue

mean(rmse_na_anova)/mean(data_na$adjusted_revenue)
[1] 0.8445181

This error metric is very comparable to the ANOVA model in notebook two, because it is computed on the same data set, which means we can isolate the impact of adjusting revenue and budget for time. Here we get RMSE as a percentage of revenue of 84%, whereas in notebook two we got 189%, so the adjustment for the time value of money definitely had a very positive impact on the accuracy of the model.

Feature Selection on Imputed Data

Read in the file

data_rf = read.csv("rf_imputations_3.csv")
head(data_rf)

Do the same basic type conversions

#converting classes (pass as.character itself, not as.character(), to mutate_if)
data_rf <- mutate_if(data_rf, is.factor, as.character)
data_rf$budget <- as.numeric(data_rf$budget)
data_rf$Total.Revenue = as.numeric(data_rf$Total.Revenue)
head(data_rf)

Extract year from the release date column

data_rf$Year = str_extract(data_rf$release_date, "\\d{4}")
head(data_rf)

Find the adjustment value based on the base year of 2020 (this pulls the index from the US Bureau of Labor Statistics website)

table = inflation_adjust(2020)
table

Extract the adj_value in a simple form and put it into a data frame so we can join it

table <- as.data.frame(table)
table$adj_value2 <- ((100 + table$pct_increase)/100)
df <- table[,c("year","adj_value2")]
colnames(df) = c("Year", "adj_value") #changing name for left_join
df

Join the main data with the adjustment and join by Year

data_rf = left_join(df, data_rf, by = 'Year') 
data_rf

Calculate the adjusted Budget and Revenue, and convert the value to an integer

data_rf$adjusted_revenue = as.integer(data_rf$Total.Revenue/data_rf$adj_value)
NAs introduced by coercion to integer range
data_rf$adjusted_budget = as.integer(data_rf$budget/data_rf$adj_value)
data_rf

This introduced some NAs so we will remove them

data_rf <- drop_na(data_rf)

Drop the columns that we will not be needing anymore

data_rf <- subset(data_rf, select = -c(Year, adj_value, budget, Total.Revenue, title, X, original_language))

Change type of variables to factor and date accordingly

data_rf$release_date <- as.Date(data_rf$release_date)
data_rf <- data_rf %>% mutate_if(is.logical,as.factor)
data_rf <- data_rf %>% mutate_if(is.character,as.factor)
head(data_rf)

Recursive Feature Selection with Boruta on Imputed Dataset

Running the Boruta model is again easy but time-consuming, especially on this larger data set, so here we cap the number of runs at 30, by which almost all of the variables will be classified as important or unimportant.

featureSelection_rf <- Boruta(adjusted_revenue ~ ., data = data_rf, doTrace = 2, maxRuns = 30)
 1. run of importance source...
 [... per-run "Growing trees" / "Computing permutation importance" progress messages omitted ...]
After 13 iterations, +37 mins: 
 confirmed 30 attributes: action, adjusted_budget, adventure, animation, comedy and 25 more;
 rejected 11 attributes: foreign, history, mgm, mystery, num_spoken_languages and 6 more;
 still have 11 attributes left.

After 17 iterations, +46 mins: 
 rejected 2 attributes: paramount, rko_radio;
 still have 9 attributes left.

After 24 iterations, +59 mins: 
 confirmed 1 attribute: warner_bros;
 rejected 1 attribute: new_line_cinema;
 still have 7 attributes left.

After 27 iterations, +1.1 hours: 
 confirmed 1 attribute: unspecified_cast;
 still have 6 attributes left.

After 59 iterations, which took around 2.5 hours, Boruta was able to classify all but 3 features, which is a good result.

Plot the feature selection to get more information about it

plot(featureSelection_rf, las = 2, cex.axis = 0.5)

The majority of the features are classified the same way for both datasets, so the imputed data set does not significantly change the features used.

Apply a rough fix to the tentative features (assigning each to either important or unimportant) and extract the formula

featureSelectionFinal_rf <- TentativeRoughFix(featureSelection_rf)
getNonRejectedFormula(featureSelectionFinal_rf)
adjusted_revenue ~ popularity + release_date + runtime + vote_average + 
    vote_count + meterScore + is_in_collection + has_tagline + 
    number_of_cast + female_cast + male_cast + unspecified_cast + 
    number_of_crew + female_crew + male_crew + unspecified_crew + 
    comedy + horror + action + drama + documentary + crime + 
    thriller + animation + adventure + romance + family + twentieth_century + 
    warner_bros + columbia + walt_disney + new_line_cinema + 
    prod_size + num_production_companies + production_country + 
    adjusted_budget
<environment: 0x000002487e8d5108>

Take another look at the feature selection summary as a whole

featureSelection_rf
Boruta performed 59 iterations in 2.574235 hours.
 34 attributes confirmed important: action, adjusted_budget, adventure, animation,
columbia and 29 more;
 15 attributes confirmed unimportant: fantasy, foreign, history, mgm, music and 10
more;
 3 tentative attributes left: crime, meterClass, new_line_cinema;

Feature Selection with ANOVA on Imputed Dataset

Do a one way ANOVA of all the variables against adjusted_revenue

one.way_rf<- aov(adjusted_revenue ~ ., data = data_rf)
summary(one.way_rf)
                            Df    Sum Sq   Mean Sq   F value   Pr(>F)    
popularity                   1 6.614e+19 6.614e+19 11954.568  < 2e-16 ***
release_date                 1 1.644e+18 1.644e+18   297.183  < 2e-16 ***
runtime                      1 3.093e+18 3.093e+18   559.137  < 2e-16 ***
vote_average                 1 5.643e+17 5.643e+17   102.000  < 2e-16 ***
vote_count                   1 1.161e+20 1.161e+20 20985.186  < 2e-16 ***
meterScore                   1 2.110e+16 2.110e+16     3.815 0.050819 .  
meterClass                   2 2.879e+16 1.439e+16     2.602 0.074165 .  
is_in_collection             1 5.389e+18 5.389e+18   974.151  < 2e-16 ***
num_spoken_languages         1 2.173e+16 2.173e+16     3.928 0.047504 *  
has_tagline                  1 1.765e+17 1.765e+17    31.910 1.64e-08 ***
number_of_cast               1 2.001e+17 2.001e+17    36.164 1.85e-09 ***
female_cast                  1 1.424e+17 1.424e+17    25.738 3.95e-07 ***
male_cast                    1 1.286e+18 1.286e+18   232.484  < 2e-16 ***
unspecified_cast             1 1.008e+16 1.008e+16     1.822 0.177082    
number_of_crew               1 1.719e+17 1.719e+17    31.080 2.51e-08 ***
female_crew                  1 2.943e+16 2.943e+16     5.319 0.021101 *  
male_crew                    1 7.688e+17 7.688e+17   138.970  < 2e-16 ***
unspecified_crew             1 1.399e+14 1.399e+14     0.025 0.873675    
comedy                       1 2.397e+17 2.397e+17    43.330 4.74e-11 ***
horror                       1 8.514e+17 8.514e+17   153.895  < 2e-16 ***
action                       1 1.695e+15 1.695e+15     0.306 0.579967    
drama                        1 5.098e+17 5.098e+17    92.152  < 2e-16 ***
documentary                  1 1.829e+15 1.829e+15     0.331 0.565298    
science_fiction              1 1.829e+17 1.829e+17    33.059 9.07e-09 ***
crime                        1 6.195e+17 6.195e+17   111.971  < 2e-16 ***
fantasy                      1 5.125e+17 5.125e+17    92.646  < 2e-16 ***
thriller                     1 1.751e+17 1.751e+17    31.652 1.87e-08 ***
animation                    1 6.767e+17 6.767e+17   122.314  < 2e-16 ***
adventure                    1 1.229e+18 1.229e+18   222.110  < 2e-16 ***
mystery                      1 1.158e+15 1.158e+15     0.209 0.647251    
war                          1 4.837e+15 4.837e+15     0.874 0.349786    
romance                      1 8.649e+16 8.649e+16    15.633 7.72e-05 ***
music                        1 3.395e+16 3.395e+16     6.138 0.013242 *  
family                       1 9.038e+17 9.038e+17   163.377  < 2e-16 ***
western                      1 7.320e+16 7.320e+16    13.232 0.000276 ***
history                      1 4.236e+15 4.236e+15     0.766 0.381575    
tv_movie                     1 4.181e+15 4.181e+15     0.756 0.384668    
foreign                      1 5.709e+15 5.709e+15     1.032 0.309729    
paramount                    1 5.696e+17 5.696e+17   102.958  < 2e-16 ***
mgm                          1 1.343e+16 1.343e+16     2.428 0.119213    
twentieth_century            1 3.535e+17 3.535e+17    63.894 1.38e-15 ***
warner_bros                  1 1.198e+17 1.198e+17    21.656 3.28e-06 ***
universal                    1 7.951e+17 7.951e+17   143.726  < 2e-16 ***
columbia                     1 3.248e+17 3.248e+17    58.719 1.90e-14 ***
rko_radio                    1 4.041e+16 4.041e+16     7.305 0.006883 ** 
united_artists               1 1.399e+17 1.399e+17    25.286 4.99e-07 ***
walt_disney                  1 1.949e+18 1.949e+18   352.375  < 2e-16 ***
new_line_cinema              1 4.907e+16 4.907e+16     8.869 0.002904 ** 
prod_size                    1 5.327e+16 5.327e+16     9.630 0.001917 ** 
num_production_companies     1 2.998e+17 2.998e+17    54.196 1.89e-13 ***
production_country          31 6.160e+17 1.987e+16     3.592 5.93e-11 ***
adjusted_budget              1 6.591e+18 6.591e+18  1191.375  < 2e-16 ***
Residuals                19637 1.086e+20 5.532e+15                       
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Once again, the majority of the features are classified the same as in the ANOVA on the dataset where all the NAs were dropped, but there is an interesting difference: meterScore was very significant in the NA dataset, yet on the imputed dataset it is not important. This could mean that the imputations for this variable are not very accurate, or it could mean that meterScore had a big impact on the movies in the first dataset but less of an impact once the other ~15K movies were included.

Random Forest Model using Boruta Features

As mentioned at the beginning of the notebook, we wanted to run the model multiple times because of the range in error across train/test splits. For this model on the imputed dataset we will only run it 3 times, as each run takes a long time. The reason we are using 200 trees was explained in the second notebook, where we plotted the random forest to see what the ideal number of trees would be.


rmse_rf_boruta <- c()
mape_rf_boruta <- c()

for(i in 1:3){
  # Train test split
  num_samples = dim(data_rf)[1]
  sampling.rate = 0.8
  training <- sample(1:num_samples, sampling.rate * num_samples, replace=FALSE)
  trainingSet <- subset(data_rf[training, ])
  testing <- setdiff(1:num_samples,training)
  testingSet <- subset(data_rf[testing, ])
  
  #Train the model
  randomForestModel <- randomForest(adjusted_revenue ~ popularity + release_date + 
    runtime + vote_average + vote_count + meterScore + meterClass + 
    is_in_collection + has_tagline + number_of_cast + female_cast + 
    male_cast + unspecified_cast + number_of_crew + female_crew + 
    male_crew + unspecified_crew + comedy + horror + action + 
    drama + documentary + thriller + animation + adventure + 
    romance + family + twentieth_century + warner_bros + columbia + 
    walt_disney + prod_size + num_production_companies + production_country + 
    adjusted_budget, data=trainingSet, ntree=200)
  
  #Calculate the error
  predictions <- predict(randomForestModel, testingSet)
  error = predictions - testingSet$adjusted_revenue
  mse = mean(error^2)
  rmse_rf_boruta[i] <- sqrt(mse)
  errorpct <- ((abs(testingSet$adjusted_revenue - predictions))/testingSet$adjusted_revenue)
  mape_rf_boruta[i] <- mean(errorpct)
}
rmse_rf_boruta
[1] 69699126 61177562 59283379
mape_rf_boruta
[1] 132.6351 106.4524 188.4149

We initially see that the RMSE is really large (in the $60M range), but we know the error is significantly exaggerated, since the individual errors are already big, and with 20K data points it is not a good measure of error here. We were only able to run this model 3 times, as each run takes a very long time, so the error values are hard to take at face value. The interesting part is that the RMSE decreased significantly while the MAPE increased significantly. This is likely because dropping all the NAs in the first dataset left mostly movies with higher revenue, which makes sense, as bigger movies (more box office) are generally better documented. This can be confirmed by taking the average of both revenue columns.
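To make the contrast between the two metrics concrete, here is a small standalone sketch of how RMSE and MAPE are computed, with toy numbers rather than the real predictions (the arithmetic mirrors the inline calculations in the loop above; MAPE is kept as a fraction, so 0.15 reads as 15%):

```r
# Toy illustration of the two error metrics used throughout this notebook
rmse <- function(actual, predicted) sqrt(mean((predicted - actual)^2))
mape <- function(actual, predicted) mean(abs(actual - predicted) / actual)

actual    <- c(100, 200, 400)
predicted <- c(110, 180, 500)
rmse(actual, predicted)  # 59.16 -- dominated by the single large miss
mape(actual, predicted)  # 0.15, i.e. 15% -- every point weighted equally
```

This is exactly why RMSE looks so exaggerated here: one huge miss inflates RMSE far more than it inflates MAPE.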

mean(data_na$adjusted_revenue)
[1] 141211676
mean(data_rf$adjusted_revenue)
[1] 42918089

The hypothesis is confirmed: the imputed data has 4 times the data points, but the average revenue decreases significantly

mean(rmse_rf_boruta)
[1] 63386689
mean(mape_rf_boruta)
[1] 142.5008

We took the average to compare against the NA values, and the MAPE is roughly 2x worse, which is not good. This could be because imputation introduces errors of its own, and then using imputed data to predict adds to the error. This leads us to the conclusion that imputing the data does not always result in better error results, at least not with "simple" models like random forest.

To compare this model against the models done in the second R notebook we will take RMSE/ average adjusted_revenue

mean(rmse_rf_boruta)/mean(data_rf$adjusted_revenue)
[1] 1.476922

This error metric is lower than the results in the second notebook. While it is not significantly lower, it shows that at the very least feature selection has a positive impact on the error.

Random Forest Model using ANOVA Features

Similar to the model above we will be running it 3 times with 200 trees.

rmse_rf_anova <- c()
mape_rf_anova <- c()

for(i in 1:3){
  #Train test split
  num_samples = dim(data_rf)[1]
  sampling.rate = 0.8
  training <- sample(1:num_samples, sampling.rate * num_samples, replace=FALSE)
  trainingSet <- subset(data_rf[training, ])
  testing <- setdiff(1:num_samples,training)
  testingSet <- subset(data_rf[testing, ])
  
  #Train the model
  randomForestModel <- randomForest(adjusted_revenue ~ . - meterClass - meterScore - unspecified_cast - unspecified_crew - action - documentary - mystery - war - history - tv_movie - foreign - mgm, data=trainingSet, ntree=200)
  
  #Calculate the error
  predictions <- predict(randomForestModel, testingSet)
  error = predictions - testingSet$adjusted_revenue
  mse = mean(error^2)
  rmse_rf_anova[i] <- sqrt(mse)
  errorpct <- ((abs(testingSet$adjusted_revenue - predictions))/testingSet$adjusted_revenue)
  mape_rf_anova[i] <- mean(errorpct)
  
}
rmse_rf_anova
[1] 58030853 58763061 62084336
mape_rf_anova
[1] 112.7862 148.1727 139.2491

The RMSE and MAPE are lower for the ANOVA selection than for the Boruta selection. Again, this leads us to believe that in this case ANOVA selection may actually be better than Boruta selection, but we cannot say so definitively because the two were not run on the same train/test splits.

mean(rmse_rf_anova)
[1] 59626083
mean(mape_rf_anova)
[1] 133.4027

To compare this model against the models done in the second R notebook we will take RMSE/ average adjusted_revenue

mean(rmse_rf_anova)/mean(data_rf$adjusted_revenue)
[1] 1.3893

Once again, this error metric is lower than the results in the second notebook.

Next Steps

As we do not have a lot of computational power, we were not able to run a neural network model. It already took an entire night to run the random forest model, so neural net models would have taken much longer. We believe that if we can find the right structure for a neural net model, it would perform better on the imputed data than on the data where we drop the NAs: with 4 times the amount of training data, we would be able to train the model better and decrease the MAPE.
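As a rough illustration of what we have in mind, here is a minimal single-hidden-layer network using the nnet package. The data here is synthetic stand-in data, and the size/decay values are guesses that would need real tuning; in the notebook itself this would run on trainingSet/testingSet with inputs scaled first, which matters a lot for neural nets:

```r
library(nnet)  # single-hidden-layer neural networks; ships with standard R installs

# Synthetic stand-in data -- hypothetical, only to show the shape of the code
set.seed(1)
n <- 500
budget  <- runif(n, 0.01, 1)                 # already scaled to roughly [0, 1]
revenue <- 2.5 * budget + rnorm(n, sd = 0.1)
d <- data.frame(budget, revenue)

# linout = TRUE because this is regression, not classification
nn <- nnet(revenue ~ budget, data = d[1:400, ],
           size = 5, decay = 0.01, linout = TRUE,
           maxit = 500, trace = FALSE)
pred <- predict(nn, d[401:500, ])
mean(abs(d$revenue[401:500] - pred) / d$revenue[401:500])  # held-out MAPE
```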

Another thing that we were not able to do in this project is some sort of optimization, where we determine the combination of variables that maximizes revenue.

Another interesting problem we face is that the MAPE differs drastically based on the train/test split; we would like to understand the reason for this and how it could be fixed.
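One cheap way to study this would be to repeat the split many times and look at the spread of the error directly. Since rerunning the random forest dozens of times is too slow for us, here is the idea sketched on the built-in mtcars data with a fast linear model instead:

```r
# Repeat an 80/20 split many times and measure how much MAPE moves
# (a fast stand-in for doing the same with the random forest above)
set.seed(42)
mapes <- sapply(1:50, function(i) {
  n     <- nrow(mtcars)
  train <- sample(1:n, floor(0.8 * n))
  fit   <- lm(mpg ~ wt + hp, data = mtcars[train, ])
  pred  <- predict(fit, mtcars[-train, ])
  mean(abs(mtcars$mpg[-train] - pred) / mtcars$mpg[-train])
})
range(mapes)  # split-to-split spread of the error
sd(mapes)
```

If the spread here is wide even for a simple, stable model, a large MAPE range across splits is expected behavior and averaging over many splits (or k-fold cross-validation) is the fix.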

---
title: "Movie Boxoffice Predictions 3"
output: html_notebook
---


The first thing we must do as always is load all of the libraries that we will be using
```{r}
library(readr)
library(stringr)
library(tidyverse)
library(dplyr)
library(mice)
library(VIM)
library(plyr)
library(tidyr)
library(ggplot2)
library(sf)
library(sjmisc)
library(highcharter)
library(openair)
library(zoo)
library(countrycode)
library(ggmap)
library(blscrapeR) ##needed to get index for adjusting inflation
library(Boruta)
library(randomForest)
library(mlbench)
library(Metrics)
```
## Work with the NAs csv
This is the csv where all the NAs are included and we will just drop all NA values

Read in the data that we want to work with
```{r}
data_na = read.csv("allMerge_clean_withNA.csv")
head(data_na)
```

Do some basic class conversions
```{r}
#converting classes 
data_na <- mutate_if(data_na, is.factor, as.character)
data_na$budget <- as.numeric(data_na$budget)
data_na$Total.Revenue = as.numeric(data_na$Total.Revenue)
head(data_na)
```

### Adjust Revenue and Budget for Time Value of Money

Extract the Year from the release_date column and store it in a variable called year 
```{r}
data_na$Year = str_extract(data_na$release_date, "\\d{4}")
head(data_na)
```



Create a table that will give us the adjustment amount based on a base year of 2020
```{r}
table = inflation_adjust(2020)
table
```


Create a data frame with the values we need from the table
```{r}
table <- as.data.frame(table)
table$adj_value2 <- ((100 + table$pct_increase)/100)
df <- table[,c("year","adj_value2")]
colnames(df) = c("Year", "adj_value") #changing name for left_join
df
```


Merge the data frame from above with this data frame based on the year
```{r}
data_na = left_join(df, data_na, by = 'Year') 
data_na
```


Convert adjusted_revenue and adjusted_budget to integers. We do this because the adjustments are very precise and leave a lot of decimal places in some cases
```{r}
data_na$adjusted_revenue = as.integer(data_na$Total.Revenue/data_na$adj_value)
data_na$adjusted_budget = as.integer(data_na$budget/data_na$adj_value)
data_na
```

Drop all the columns that we will not be using
```{r}
data_na = subset(data_na, select = -c(homepage, id, imdb_id, overview, poster_path, revenue, status, video, original_title, orginal_title_2, year_2, Year, adj_value, budget, Total.Revenue, title, X,production_countries, production_companies, tagline, spoken_languages, genres, cast, crew, belongs_to_collection, prod_comp_name, adult, original_language))

```


Also drop all the NA values from the dataset, leaving us with ~5000 data points
```{r}
data_na <- drop_na(data_na)
```

Convert the variable types to factors
```{r}
data_na$release_date <- as.Date(data_na$release_date)
data_na <- data_na %>% mutate_if(is.logical,as.factor)
data_na <- data_na %>% mutate_if(is.character,as.factor)
head(data_na) 
```

### Recursive Feature Selection with Boruta on NA Dataset


It is very easy to run the Boruta model, but it can be time consuming in many cases, so we will cap the runs at 100, by which point almost all of the variables will be classified as important or unimportant. 
```{r}
featureSelection_na <- Boruta(adjusted_revenue ~ ., data = data_na, doTrace = 2, maxRuns = 100)
```

We can plot the feature selection that boruta returns to get more insight about the relevance of certain variables
```{r}
plot(featureSelection_na, las = 2, cex.axis = 0.5)
```
We get a lot of interesting initial results. They tell us that adjusted_budget is by far the most relevant variable in predicting adjusted_revenue. is_in_collection and vote_count are also very important variables, not far behind adjusted_budget. Intuitively it makes a lot of sense that these variables should affect revenue: you would definitely expect budget and revenue to be positively correlated, you would expect vote count to go up for "good movies", and generally a sequel is made when the first movie does really well at the box office. 

There are various variables (mainly the dummy variables created for production company and genre) that are deemed not important, like mgm, music, rko_radio, etc. Of all the unimportant variables, only one is not a dummy variable: num_spoken_languages. 

There are also 3 variables that are tentative, but we can see where they fall on the graph above, so we can choose what to do with them. 

Interestingly, the ranking of the variables is not the same when you run the selection multiple times; depending on the seed, the ranking changes slightly. adjusted_budget, is_in_collection and release_date were always the top 3, but the order among the genres would sometimes differ. So we cannot definitively say that, for example, an adventure movie's genre has a bigger impact on revenue than a family movie's. 

So we will fix the tentative variables (TentativeRoughFix assigns each tentative variable to either important or unimportant using the information it already has) and get the formula that we will plug into the random forest model
```{r}
featureSelectionFinal_na <- TentativeRoughFix(featureSelection_na)
getNonRejectedFormula(featureSelectionFinal_na)
```

```{r}
featureSelection_na
```

### Feature Selection using ANOVA with the NA data set

Do a one way ANOVA of all the variables against adjusted_revenue
```{r}
one.way_na <- aov(adjusted_revenue ~ ., data = data_na)
summary(one.way_na)
```
As we know from DMA, the cutoff p-value is 0.05, so anything above that is not important and anything below it is. The stars beside each variable also indicate the level of importance, but they do not give a clear ranking of which variables are the most important: many rows simply show < 2e-16, telling us they are very important without distinguishing between them. 

There are 6 variables with a p-value of < 2e-16, and three of them are vote_count, adjusted_budget and is_in_collection, so the top 3 are the same for both feature selection methods. 

The number of features selected is about the same, but there are differences in which features are selected; for example, unspecified_cast is deemed not important by the ANOVA model but important by the Boruta model. 
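As a side note, the significant rows can be pulled out of the aov table programmatically instead of reading the stars by eye. A minimal sketch on the built-in mtcars data (hypothetical variables; the same pattern applies to one.way_na above):

```r
# Mini-example: extract the predictors with p < 0.05 from an aov fit
fit <- aov(mpg ~ factor(cyl) + wt + factor(am), data = mtcars)
tab <- summary(fit)[[1]]             # the ANOVA table as a data frame
p   <- tab[["Pr(>F)"]]
# rownames carry trailing spaces in aov output, hence trimws();
# Residuals has p = NA and is dropped by the filter
sig <- trimws(rownames(tab))[!is.na(p) & p < 0.05]
sig
```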

After running the models we can see which one generally performed better


### Random Forest Model using Boruta Features

As mentioned at the beginning of the notebook, we wanted to run the model multiple times because of the range in error across train/test splits. For the NA model we will run it 30 times, as it did not take too long to run. The reason we are using 200 trees was explained in the second notebook, where we plotted the random forest to see what the ideal number of trees would be. 
```{r}
#We want to store the error metrics to analyze later 
rmse_na_boruta <- c()
mape_na_boruta <- c()

for(i in 1:30){
  
  # Train test split
  num_samples = dim(data_na)[1]
  sampling.rate = 0.8
  training <- sample(1:num_samples, sampling.rate * num_samples, replace=FALSE)
  trainingSet <- subset(data_na[training, ])
  testing <- setdiff(1:num_samples,training)
  testingSet <- subset(data_na[testing, ])
  
  #Train the model
  randomForestModel <- randomForest(adjusted_revenue ~ popularity + release_date + runtime + vote_average + 
    vote_count + meterScore + meterClass + is_in_collection + 
    has_tagline + number_of_cast + female_cast + male_cast + 
    unspecified_cast + number_of_crew + female_crew + male_crew + 
    unspecified_crew + comedy + horror + action + drama + fantasy + 
    thriller + animation + adventure + romance + family + twentieth_century + 
    warner_bros + universal + walt_disney + prod_size + num_production_companies + 
    production_country + adjusted_budget, data=trainingSet, ntree=200)
  
  #Calculate the error
  predictions <- predict(randomForestModel, testingSet)
  error = predictions - testingSet$adjusted_revenue
  mse = mean(error^2)
  rmse_na_boruta[i] <- sqrt(mse)
  errorpct <- ((abs(testingSet$adjusted_revenue - predictions))/testingSet$adjusted_revenue)
  mape_na_boruta[i] <- mean(errorpct)
  
}

```

```{r}
rmse_na_boruta
mape_na_boruta
```
We initially see that the RMSE is really large (in the $100M range), but we know the error is significantly exaggerated, since the individual errors are already big, and with ~5K data points it is not a good measure of error here. 
For this model we were able to run 30 train/test splits, and we can see that the range in MAPE is really big, from 14% to 152%, so we calculate the average to get a more stable number 

```{r}
mean(rmse_na_boruta)
mean(mape_na_boruta)
```
The average MAPE is 68%, which is not great, but we already know that we cannot predict box office very accurately with our variables; the goal here is to compare feature selection methods 


To compare this model against the models from the second R notebook we will take RMSE / mean adjusted_revenue.
```{r}
mean(rmse_na_boruta)/mean(data_na$adjusted_revenue)
```
This number is significantly lower than the numbers we got in the second notebook, which shows that feature selection definitely improves accuracy. There is also the added benefit of requiring less computational power, since we are analyzing fewer features. We can also see how much RMSE exaggerates the error, by ~15% in this case.


### Random Forest Model using ANOVA Features

As with the model above, we will run it 30 times with 200 trees.
```{r}

#Store the error metrics
rmse_na_anova <- c()
mape_na_anova <- c()

for(i in 1:30){
  
  #Train test split
  num_samples = dim(data_na)[1]
  sampling.rate = 0.8
  training <- sample(1:num_samples, sampling.rate * num_samples, replace=FALSE)
  trainingSet <- subset(data_na[training, ])
  testing <- setdiff(1:num_samples,training)
  testingSet <- subset(data_na[testing, ])
  
  #Train the model
  randomForestModel <- randomForest(adjusted_revenue ~ .- meterScore - num_spoken_languages - number_of_cast - unspecified_cast  - action - documentary - mystery - war - music - tv_movie - foreign - mgm - warner_bros - rko_radio - new_line_cinema - production_country, data=trainingSet, ntree=200)
  
  #Calculate the error
  predictions <- predict(randomForestModel, testingSet)
  error = predictions - testingSet$adjusted_revenue
  mse = mean(error^2)
  rmse_na_anova[i] <- sqrt(mse)
  errorpct <- ((abs(testingSet$adjusted_revenue - predictions))/testingSet$adjusted_revenue)
  mape_na_anova[i] <- mean(errorpct)
  
}
```


```{r}
rmse_na_anova
mape_na_anova
```
Again the RMSE is very large, in the $100M range, but as before RMSE exaggerates the error since the individual errors are big, and with ~5K data points it is not a good standalone measure.
Looking at the MAPE, its range is smaller than the range for the Boruta model, going from 16% to 111%.

```{r}
mean(rmse_na_anova)
mean(mape_na_anova)
```
As we could already see, the features from ANOVA performed better. However, something we realized after running the model is that we cannot make this comparison directly: the train/test split is different for each model, so the difference in error could be due to the split rather than the features. This is something we wanted to fix, but the model needs to run overnight and we ran out of time before submission.
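One way to make the comparison fair without a full overnight rerun is to fix the random seed (or precompute a single split) so that both feature sets are trained and tested on exactly the same rows. A minimal sketch, assuming `data_na` from earlier; `set.seed()` is base R and makes `sample()` reproducible:

```{r}
# Fix the seed so sample() returns the same indices on every run;
# training both the Boruta-feature and ANOVA-feature models on this
# one shared split isolates the effect of the features themselves.
set.seed(42)
num_samples <- dim(data_na)[1]
training <- sample(1:num_samples, 0.8 * num_samples, replace = FALSE)
trainingSet <- data_na[training, ]
testingSet <- data_na[setdiff(1:num_samples, training), ]
```

Any remaining difference in MAPE between the two models is then attributable to the selected features rather than to the split.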

To compare this model against the models from the second R notebook we will take RMSE / mean adjusted_revenue.
```{r}
mean(rmse_na_anova)/mean(data_na$adjusted_revenue)
```
This error metric is directly comparable to the ANOVA model in notebook two because it is run on the same dataset, which lets us isolate the impact of adjusting revenue and budget for time. Here the RMSE is 84% of mean revenue, whereas in notebook two we got 189%, so the adjustment for the time value of money definitely had a very positive impact on the accuracy of the model.


## Feature Selection on Imputed Data

Read in the file
```{r}
data_rf = read.csv("rf_imputations_3.csv")
head(data_rf)
```


Do the same basic type conversions
```{r}
# Convert factor columns to character (pass the function itself, not a call)
data_rf <- mutate_if(data_rf, is.factor, as.character)
data_rf$budget <- as.numeric(data_rf$budget)
data_rf$Total.Revenue = as.numeric(data_rf$Total.Revenue)
head(data_rf)
```


Extract year from the release date column 
```{r}
data_rf$Year = str_extract(data_rf$release_date, "\\d{4}")
head(data_rf)
```



Find the adjustment value based on the base year of 2020 (this scrapes the US Bureau of Labor Statistics website for CPI information)
```{r}
table = inflation_adjust(2020)
table
```


Extract the adj_value in a simple form and put it into a data frame so we can join it  
```{r}
table <- as.data.frame(table)
table$adj_value2 <- ((100 + table$pct_increase)/100)
df <- table[,c("year","adj_value2")]
colnames(df) = c("Year", "adj_value") #changing name for left_join
df
```


Join the main data with the adjustment and join by Year
```{r}
data_rf = left_join(df, data_rf, by = 'Year') 
data_rf
```

Calculate the adjusted Budget and Revenue, and convert the value to an integer
```{r}
data_rf$adjusted_revenue = as.integer(data_rf$Total.Revenue/data_rf$adj_value)
data_rf$adjusted_budget = as.integer(data_rf$budget/data_rf$adj_value)
data_rf
```
This introduced some NAs so we will remove them 
```{r}
data_rf <- drop_na(data_rf)
```


Drop the columns that we will not be needing anymore
```{r}
data_rf <- subset(data_rf, select = -c(Year, adj_value, budget, Total.Revenue, title, X, original_language))
```

Change type of variables to factor and date accordingly 
```{r}
data_rf$release_date <- as.Date(data_rf$release_date)
data_rf <- data_rf %>% mutate_if(is.logical,as.factor)
data_rf <- data_rf %>% mutate_if(is.character,as.factor)
head(data_rf)
```


### Recursive Feature Selection with Boruta on Imputed Dataset


It is very easy to run the Boruta model, but it can be time consuming, so we will cap the runs at 30, by which point almost all of the variables will be classified as important or unimportant.

```{r}
featureSelection_rf <- Boruta(adjusted_revenue ~ ., data = data_rf, doTrace = 2, maxRuns = 30)
```
After 30 runs, which took around 1.5 hours, it was able to classify all but 3 features, which is good.

Plot the feature selection to get more information about it
```{r}
plot(featureSelection_rf, las = 2, cex.axis = 0.5)
```

The majority of the features are classified the same way for both datasets, so the imputed dataset does not significantly change the features used.



Apply a rough fix to the tentative features (assign each to either important or unimportant) and get the formula
```{r}
featureSelectionFinal_rf <- TentativeRoughFix(featureSelection_rf)
getNonRejectedFormula(featureSelectionFinal_rf)
```
Take another look at the feature selection overall
```{r}
featureSelection_rf
```
### Feature Selection with ANOVA on Imputed Dataset

Do a one way ANOVA of all the variables against adjusted_revenue
```{r}
one.way_rf<- aov(adjusted_revenue ~ ., data = data_rf)
summary(one.way_rf)
```
Once again the majority of the features are classified the same as in the ANOVA on the dataset where all the NAs were dropped, but there is an interesting difference: meterScore was very significant in the NA dataset yet not important in the imputed one. This could mean that the imputations for this variable are not very accurate, or that meterScore had a big impact on the movies in the first dataset but much less of an impact once the other 15K movies were included.



### Random Forest Model using Boruta Features

As mentioned at the beginning of the notebook, we wanted to run the model multiple times because of the range in error across train/test splits. For this model we will only run it 3 times, as each run takes a long time. The reason we use 200 trees was explained in the second notebook, where we plotted the random forest to see what the ideal number of trees would be.
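As a quick sanity check on the tree count, the fitted randomForest object stores the out-of-bag MSE after each tree, and base `plot()` draws it, so the point where the curve flattens can be read off directly. A sketch, assuming a fitted model named `randomForestModel` as in the chunks below:

```{r}
# plot() on a regression randomForest shows OOB MSE vs number of trees;
# if the curve is flat well before 200, ntree = 200 is a safe ceiling.
plot(randomForestModel, main = "OOB error vs number of trees")
```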

```{r}

rmse_rf_boruta <- c()
mape_rf_boruta <- c()

for(i in 1:3){
  # Train test split
  num_samples = dim(data_rf)[1]
  sampling.rate = 0.8
  training <- sample(1:num_samples, sampling.rate * num_samples, replace=FALSE)
  trainingSet <- subset(data_rf[training, ])
  testing <- setdiff(1:num_samples,training)
  testingSet <- subset(data_rf[testing, ])
  
  #Train the model
  randomForestModel <- randomForest(adjusted_revenue ~ popularity + release_date + 
    runtime + vote_average + vote_count + meterScore + meterClass + 
    is_in_collection + has_tagline + number_of_cast + female_cast + 
    male_cast + unspecified_cast + number_of_crew + female_crew + 
    male_crew + unspecified_crew + comedy + horror + action + 
    drama + documentary + thriller + animation + adventure + 
    romance + family + twentieth_century + warner_bros + columbia + 
    walt_disney + prod_size + num_production_companies + production_country + 
    adjusted_budget, data=trainingSet, ntree=200)
  
  #Calculate the error
  predictions <- predict(randomForestModel, testingSet)
  error = predictions - testingSet$adjusted_revenue
  mse = mean(error^2)
  rmse_rf_boruta[i] <- sqrt(mse)
  errorpct <- ((abs(testingSet$adjusted_revenue - predictions))/testingSet$adjusted_revenue)
  mape_rf_boruta[i] <- mean(errorpct)
}

```


```{r}
rmse_rf_boruta
mape_rf_boruta
```
The RMSE is again very large, in the $50M range, but as before RMSE exaggerates the error since the individual errors are big, and with ~20K data points it is not a good standalone measure.
We were only able to run this model 3 times because each run takes a very long time, so the error values are hard to take at face value. The interesting part is that the RMSE decreased significantly while the MAPE increased significantly. This is likely because dropping the NAs in the first dataset left mostly movies with higher revenue, which makes sense: bigger movies (larger box office) are generally better documented. We can confirm this by taking the average of both revenue columns.


```{r}
mean(data_na$adjusted_revenue)
mean(data_rf$adjusted_revenue)
```
The hypothesis is confirmed: the imputed data has 4 times the data points, but the average revenue decreases significantly.

```{r}
mean(rmse_rf_boruta)
mean(mape_rf_boruta)
```
We took the average to compare to the NA-dataset values, and the MAPE is ~2.5x worse, which is not good. This could be because imputation introduces its own errors, and predicting from imputed data compounds them. This leads us to the conclusion that imputing the data does not always improve error results, at least not with "simple" models like random forest.

To compare this model against the models from the second R notebook we will take RMSE / mean adjusted_revenue.
```{r}
mean(rmse_rf_boruta)/mean(data_rf$adjusted_revenue)
```
This error metric is lower than the results in the second notebook. While it is not dramatically lower, it shows that feature selection has at least a modestly positive impact on the error.


### Random Forest Model using ANOVA Features

As with the model above, we will run it 3 times with 200 trees.

```{r}
rmse_rf_anova <- c()
mape_rf_anova <- c()

for(i in 1:3){
  #Train test split
  num_samples = dim(data_rf)[1]
  sampling.rate = 0.8
  training <- sample(1:num_samples, sampling.rate * num_samples, replace=FALSE)
  trainingSet <- subset(data_rf[training, ])
  testing <- setdiff(1:num_samples,training)
  testingSet <- subset(data_rf[testing, ])
  
  #Train the model
  randomForestModel <- randomForest(adjusted_revenue ~ . - meterClass - meterScore - unspecified_cast - unspecified_crew - action - documentary - mystery - war - history - tv_movie - foreign - mgm, data=trainingSet, ntree=200)
  
  #Calculate the error
  predictions <- predict(randomForestModel, testingSet)
  error = predictions - testingSet$adjusted_revenue
  mse = mean(error^2)
  rmse_rf_anova[i] <- sqrt(mse)
  errorpct <- ((abs(testingSet$adjusted_revenue - predictions))/testingSet$adjusted_revenue)
  mape_rf_anova[i] <- mean(errorpct)
  
}
```


```{r}
rmse_rf_anova
mape_rf_anova
```
The RMSE and MAPE are lower for the ANOVA selection than for the Boruta selection, which again suggests that ANOVA selection may be better in this case. But once again we do not actually know this, because the two models were not run on the same train/test split.


```{r}
mean(rmse_rf_anova)
mean(mape_rf_anova)
```
To compare this model against the models from the second R notebook we will take RMSE / mean adjusted_revenue.
```{r}
mean(rmse_rf_anova)/mean(data_rf$adjusted_revenue)
```

Once again this error metric is lower than the results in the second notebook.


### Next Steps

As we do not have a lot of computational power, we were not able to run a neural network model. It already took an entire night to run the random forest models, so neural network models would have taken much longer. We believe that with the right architecture, a neural network would perform better on the imputed data than on the NA-dropped data: with 4 times the amount of training data we could train the model better and decrease the MAPE.

Another thing we were not able to do in this project is an optimization step that determines the combination of variables that maximizes revenue.

Another interesting problem we are facing is that the MAPE differs drastically depending on the train/test split; we would like to understand why this happens and how we can fix it.
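A standard way to both measure and dampen this split-to-split variance is k-fold cross-validation: every row is held out exactly once, and the spread of the per-fold MAPEs shows how split-dependent the error really is. A minimal sketch, assuming `data_na` and the randomForest library from earlier (the fold count and seed here are arbitrary choices):

```{r}
# 5-fold cross-validation: each row appears in a test fold exactly once
k <- 5
set.seed(42)
folds <- sample(rep(1:k, length.out = nrow(data_na)))
mape_cv <- numeric(k)
for (f in 1:k) {
  trainSet <- data_na[folds != f, ]
  testSet  <- data_na[folds == f, ]
  model <- randomForest(adjusted_revenue ~ ., data = trainSet, ntree = 200)
  preds <- predict(model, testSet)
  mape_cv[f] <- mean(abs(testSet$adjusted_revenue - preds) / testSet$adjusted_revenue)
}
mape_cv        # per-fold MAPEs expose the split-driven spread
mean(mape_cv)  # a more stable single estimate
```

The per-fold values replace the single-split MAPE with a distribution, so a wide spread here would confirm that the split, not the model, drives much of the variation.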




